Breaking the Small Cluster Barrier of Graph Clustering: Supplementary Material
Abstract
In this supplementary material, we present proof details.

1 Notation and Conventions

We use the following notation and conventions throughout the supplement. For a real $n \times n$ matrix $M$, we use the unadorned norm $\|M\|$ to denote its spectral norm. The notation $\|M\|_F$ refers to the Frobenius norm, $\|M\|_1$ is $\sum_{i,j} |M(i,j)|$, and $\|M\|_\infty$ is $\max_{i,j} |M(i,j)|$.

We will also study operators on the space of matrices. To distinguish them from the matrices studied in this work, we will simply call these objects "operators", and will denote them using a calligraphic font, e.g. $\mathcal{P}$. The norm $\|\mathcal{P}\|$ of an operator is defined as

$$\|\mathcal{P}\| = \sup_{M : \|M\|_F = 1} \|\mathcal{P}M\|_F ,$$

where the supremum is over matrices $M$.

For a fixed, real $n \times n$ matrix $M$, we define the matrix linear subspace $T(M)$ as follows:

$$T(M) := \{ YM + MX : X, Y \in \mathbb{R}^{n \times n} \} .$$

In words, this subspace is the set of matrices spanned by matrices each row of which is in the row space of $M$, and matrices each column of which is in the column space of $M$.

For any given subspace of matrices $S \subseteq \mathbb{R}^{n \times n}$, we let $\mathcal{P}_S$ denote the orthogonal projection onto $S$ with respect to the inner product $\langle X, Y \rangle = \sum_{i,j=1}^{n} X(i,j) Y(i,j) = \operatorname{tr} X^\top Y$. This means that for any matrix $M$,

$$\mathcal{P}_S M = \operatorname*{argmin}_{X \in S} \|M - X\|_F .$$

For a matrix $M$, we let $\Gamma(M)$ denote the set of matrices supported on a subset of the support of $M$. Note that for any matrix $X$,

$$(\mathcal{P}_{\Gamma(X)} M)(i,j) = \begin{cases} M(i,j) & X(i,j) \neq 0 \\ 0 & \text{otherwise} . \end{cases}$$

It is a well known fact that $\mathcal{P}_{T(X)}$ is given as follows:

$$\mathcal{P}_{T(X)} M = P_{C(X)} M + M P_{R(X)} - P_{C(X)} M P_{R(X)} ,$$
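The notation above can be checked numerically with a small sketch. It is only an illustration, not the authors' code, and it rests on one assumption: the excerpt breaks off before defining $P_{C(X)}$ and $P_{R(X)}$, so they are taken here to be the orthogonal projectors onto the column space and row space of $X$. The function names `proj_support` and `proj_T` and the two-cluster matrix `X` are hypothetical choices made for this example.

```python
import numpy as np

def proj_support(X, M):
    """P_{Gamma(X)} M: keep M(i, j) where X(i, j) != 0, zero elsewhere."""
    return np.where(X != 0, M, 0.0)

def proj_T(X, M, tol=1e-10):
    """P_{T(X)} M = P_C M + M P_R - P_C M P_R.

    Assumption: P_C and P_R are the orthogonal projectors onto the
    column space and row space of X, built here from the SVD of X.
    """
    U, s, Vt = np.linalg.svd(X)
    r = int(np.sum(s > tol))      # numerical rank of X
    U_r = U[:, :r]                # orthonormal basis of the column space
    V_r = Vt[:r, :].T             # orthonormal basis of the row space
    P_C = U_r @ U_r.T
    P_R = V_r @ V_r.T
    return P_C @ M + M @ P_R - P_C @ M @ P_R

# Illustrative X: block-diagonal 0/1 matrix with clusters of sizes 3 and 2.
X = np.zeros((5, 5))
X[:3, :3] = 1.0
X[3:, 3:] = 1.0

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))

# The four matrix norms used in the supplement.
spectral = np.linalg.norm(M, 2)    # ||M||
frobenius = np.linalg.norm(M, "fro")  # ||M||_F
l1 = np.abs(M).sum()               # ||M||_1, entrywise sum (not the induced 1-norm)
linf = np.abs(M).max()             # ||M||_inf, entrywise max

PM = proj_T(X, M)
# Trace inner product <M - PM, PM>; it vanishes for an orthogonal projection.
residual_inner = np.trace((M - PM).T @ PM)
```

Note that `np.linalg.norm(M, 1)` would return the maximum absolute column sum rather than the entrywise sum defined above, which is why $\|M\|_1$ is computed with `np.abs(M).sum()`.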
Related resources
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with huge volumes of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for graph clustering, but we do not have any choice to select the number of clusters and the number of members ...
Graph-based clustering for finding distant relationships in a large set of protein sequences
MOTIVATION: Clustering of protein sequences is widely used for the functional characterization of proteins. However, it is still not easy to cluster distantly-related proteins, which have only regional similarity among their sequences. It is therefore necessary to develop an algorithm for clustering such distantly-related proteins. RESULTS: We have developed a time and space efficient clusterin...
Finding Community Base on Web Graph Clustering
Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. So web graph clustering and cluster placement in the corresponding answers help users achieve their intended results. Community (web communit...
Breaking the Small Cluster Barrier of Graph Clustering
This paper investigates graph clustering in the planted cluster model in the presence of small clusters. Traditional results dictate that for an algorithm to provably correctly recover the clusters, all clusters must be sufficiently large (in particular, $\tilde{\Omega}(\sqrt{n})$ where $n$ is the number of nodes of the graph). We show that this is not really a restriction: by a more refined analysis of the trace-n...
Data Clustering Based on Key Identification
Clustering has been one of the main building blocks in the fields of machine learning and computer vision. Given a pair-wise distance measure, it is challenging to find a proper way to identify a subset of representative exemplars and its associated cluster structures. Recent trends in big data analysis pose a more demanding requirement on new clustering algorithms to be both scalable and accura...
Publication date: 2013